Automatic multi-modal dialogue scene indexing

Author

  • A. Aydin Alatan
Abstract

An automatic algorithm for indexing dialogue scenes in multimedia content is proposed. The content is segmented into dialogue scenes using the state transitions of a hidden Markov model (HMM). Each shot is classified using both audio and visual information to determine the state/scene transitions for this model. Face detection and silence/speech/music classification are the basic tools used to index the scenes. Face information is extracted by applying heuristics to skin-colored regions, while audio analysis examines the signal energy, periodicity and zero crossing rate (ZCR) of the audio waveform. The simulation results show that dialogues can be indexed automatically using the proposed algorithm.
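The three audio features named in the abstract can be sketched per frame as follows. This is a minimal illustration, not the paper's implementation: the function names, the use of a normalized autocorrelation peak as the periodicity measure, and the 8 kHz test tone are all assumptions made here for demonstration.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def periodicity(frame, min_lag, max_lag):
    """Largest autocorrelation value over the given lags, divided by the
    frame's total energy (an assumed periodicity measure; the abstract
    does not say how periodicity is computed)."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return 0.0  # an all-zero frame has no periodic structure
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best

# Hypothetical test signals: a 200 Hz tone sampled at 8 kHz, and silence.
RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / RATE) for n in range(400)]
silence = [0.0] * 400

for name, frame in (("tone", tone), ("silence", silence)):
    print(name,
          round(short_time_energy(frame), 3),
          round(zero_crossing_rate(frame), 3),
          round(periodicity(frame, 20, 60), 3))
```

A classifier would then compare these values against thresholds or feed them to a statistical model; the abstract does not give the decision rule, so none is shown here.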

Similar resources

Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing

A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both the audio track and the visual content to determine the state/scene transitions of the model. After simulations with circular and left-to-right HMM topologies, it is observed that both perform very well with multi-modal inputs. More...

Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning

Automatic image annotation (AIA) refers to the association of words with whole images, which is considered a promising and effective approach to bridging the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we formulate the task of image annotation as a multi-label, multi-class semantic image classification problem and propose a simple yet effective m...

Multi-modal Video Summarization Using Hidden Markov Models for Content-based Multimedia Indexing

Yaşaroğlu, Yağız. MSc., Department of Electrical and Electronics Engineering. Supervisor: Associate Professor A. Aydın Alatan. September 2003, 75 pages. This thesis deals with scene level summarization of story-based videos. Two different approaches for story-based video summarization are investigated. ...

Multi-modal recording, analysis and indexing of poster sessions

A new project on multi-modal analysis of poster sessions is introduced. We have designed an environment dedicated to recording of poster conversations using multiple sensors, and collected a number of sessions, to which a variety of multi-modal information is annotated, including utterance units for individual speakers, backchannels, nodding, gazing, and pointing. Automatic speaker diarization,...

Multimodal corpora for human-machine interaction research

In recent years, human-machine interaction has grown in importance. One approach to ideal human-machine interaction is to develop a multi-modal system that behaves like a human being. This paper gives an overview of multimodal corpora currently being developed in Japan for this purpose. The paper describes databases of 1) multi-modal interaction, 2) audio-visual speech, 3) spoken dialogue wit...

Publication date: 2001